Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

Authors

  • Bradly C. Stadie
  • Sergey Levine
  • Pieter Abbeel
Abstract

Achieving efficient and scalable exploration in complex domains poses a major challenge in reinforcement learning. While Bayesian and PAC-MDP approaches to the exploration problem offer strong formal guarantees, they are often impractical in higher dimensions due to their reliance on enumerating the state-action space. Hence, exploration in complex domains is often performed with simple epsilon-greedy methods. To achieve more efficient exploration, we develop a method for assigning exploration bonuses based on a concurrently learned model of the system dynamics. By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces. We demonstrate our approach on the task of learning to play Atari games from raw pixel inputs. In this domain, our method offers substantial improvements in exploration efficiency when compared with the standard epsilon-greedy approach. As a result of our improved exploration strategy, we are able to achieve state-of-the-art results on several games that pose a major challenge for prior methods.
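The sketch below illustrates the core idea described in the abstract: a forward dynamics model is trained alongside the agent, and its prediction error on each observed transition is used as an exploration bonus added to the environment reward. This is a minimal illustration rather than the authors' implementation; the class and function names (ForwardDynamicsModel, exploration_bonus), the one-hot action encoding, and the fixed beta scale factor are assumptions made for the sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ForwardDynamicsModel(nn.Module):
    """Predicts the next state encoding from the current state encoding
    and a one-hot action (illustrative architecture, not the paper's
    exact network)."""
    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state_enc, action_onehot):
        # Concatenate state encoding and action, predict the next encoding.
        return self.net(torch.cat([state_enc, action_onehot], dim=-1))

def exploration_bonus(model, state_enc, action_onehot, next_state_enc, beta=0.05):
    """Return a bonus proportional to the model's prediction error:
    transitions the model predicts poorly are treated as novel."""
    with torch.no_grad():
        pred = model(state_enc, action_onehot)
        error = F.mse_loss(pred, next_state_enc, reduction="none").mean(dim=-1)
    return beta * error

# Usage sketch: augment the environment reward before the RL update,
# e.g. r_augmented = r_env + exploration_bonus(model, s_enc, a_onehot, s_next_enc),
# while the dynamics model is trained online on the same transitions.
```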

Similar Papers

EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the obse...


Exploration for Multi-task Reinforcement Learning with Deep Generative Models

Exploration in multi-task reinforcement learning is critical in training agents to deduce the underlying MDP. Many of the existing exploration frameworks such as E3, R-max, and Thompson sampling assume a single stationary MDP and are not suitable for system identification in the multi-task setting. We present a novel method to facilitate exploration in multi-task reinforcement learning using deep gen...


Deep Reinforcement Learning for De-Novo Drug Design

We propose a novel computational strategy based on deep and reinforcement learning techniques for de-novo design of molecules with desired properties. This strategy integrates two deep neural networks – generative and predictive – that are trained separately but employed jointly to generate novel chemical structures with the desired properties. Generative models are trained to produce chemicall...


A Study of Qualitative Knowledge-Based Exploration for Continuous Deep Reinforcement Learning

As an important method for solving sequential decision-making problems, reinforcement learning learns the policy of tasks through interaction with the environment. But it has difficulty scaling to large-scale problems. One of the reasons is the exploration-exploitation dilemma, which may lead to inefficient learning. We present an approach that addresses this shortcoming by introducing qualitat...


Towards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots

Exploration in an unknown environment is a core functionality for mobile robots. Learning-based exploration methods, including convolutional neural networks, provide excellent strategies without human-designed logic for feature extraction [1]. But conventional supervised learning algorithms inevitably require considerable effort for labeling datasets. Scenes not included in the tr...



Journal:
  • CoRR

Volume: abs/1507.00814    Issue: -

Pages: -

Publication date: 2015